Geographic Data ¶
import xarray as xr
import hvplot.pandas
import hvplot.xarray
import geoviews as gv
import cartopy.crs as ccrs
from bokeh.sampledata.airport_routes import airports
Installation ¶
The plot API also has support for geographic data built on top of Cartopy and GeoViews. Both can be installed using conda with:
conda install -c pyviz geoviews
or if the cartopy dependency has been satisfied in some other way, GeoViews may also be installed using pip:
pip install geoviews
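Before moving on, it can be useful to confirm that both packages are importable in the current environment. The following is a minimal sketch using only the Python standard library; the helper name check_geo_stack is hypothetical and is not part of hvPlot, GeoViews, or Cartopy.

```python
from importlib.util import find_spec


def check_geo_stack(packages=("cartopy", "geoviews")):
    """Report which of the given top-level packages can be imported.

    Hypothetical helper for illustration; not part of hvPlot or GeoViews.
    """
    # find_spec returns None when a top-level package is not installed.
    return {name: find_spec(name) is not None for name in packages}


if __name__ == "__main__":
    for name, available in check_geo_stack().items():
        print(f"{name}: {'installed' if available else 'missing'}")
```

If either package is reported as missing, rerun one of the install commands above in the same environment that runs your notebook.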
Usage ¶
To declare a geographic plot we have to supply a cartopy.crs.CRS (or coordinate reference system). Coordinate reference systems are described in the GeoViews documentation, and the full list of available CRSs is in the cartopy documentation. Only certain hvPlot types support geographic coordinates, currently including: 'points', 'polygons', 'paths', 'image', 'quadmesh', 'contour', and 'contourf'.
As an initial example, consider a dataframe of all US airports (including military bases overseas):
airports.head(3)
Declaring a coordinate system ¶
If we want to overlay our data on geographic maps or reproject it into a geographic plot, we can set geo=True, which declares that the data will be plotted in a geographic coordinate system. The default coordinate system is the PlateCarree projection, i.e., raw longitudes and latitudes. If the data is in another coordinate system, you will need to declare an explicit crs as an argument, in which case geo=True is assumed. Either way, once hvPlot knows that your data is in geo coordinates, it can be overlaid on top of geoviews.tile_sources and geoviews.feature:
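# PlateCarree (raw lon/lat) is the default CRS; keep it for reuse in later examples
crs = ccrs.PlateCarree()
# Overlay the airport locations on ESRI map tiles
gv.tile_sources.ESRI * airports.hvplot.points(
    'Longitude', 'Latitude', geo=True, color='red', alpha=0.2, height=500,
    xlim=(-180, -30), ylim=(0, 72)
)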
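Map tiles such as gv.tile_sources.ESRI are served in the Web Mercator projection, so PlateCarree longitudes and latitudes have to be reprojected before they line up with the tiles. Declaring geo=True is what lets the plotting stack do this for you. As a rough illustration of what that reprojection involves, here is a minimal pure-Python sketch of the spherical Web Mercator forward transform. The function name lonlat_to_web_mercator is hypothetical, and in practice the conversion is handled by cartopy rather than by hand-written code like this.

```python
import math

EARTH_RADIUS_M = 6378137.0  # sphere radius used by Web Mercator (EPSG:3857)


def lonlat_to_web_mercator(lon_deg, lat_deg):
    """Convert longitude/latitude in degrees to Web Mercator metres.

    Illustrative only: hypothetical helper, not an hvPlot or GeoViews function.
    """
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y


# Corners of the xlim/ylim window used in the airports plot above
west, south = lonlat_to_web_mercator(-180, 0)
east, north = lonlat_to_web_mercator(-30, 72)
print(west, south, east, north)
```

Note how latitude is stretched nonlinearly while longitude is only scaled, which is why raw degrees cannot simply be drawn on top of the tiles.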
Since a GeoPandas DataFrame is just a Pandas DataFrame with additional geographic information, it inherits the .hvplot method. We can thus easily load shapefiles and plot them on a map:
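import geopandas as gpd
# Note: gpd.datasets was removed in GeoPandas 1.0, so this call needs an older GeoPandas release
cities = gpd.read_file(gpd.datasets.get_path('naturalearth_cities'))
# Overlay the city points on Wikipedia map tiles, showing the whole globe
gv.tile_sources.Wikipedia * cities.hvplot(global_extent=True, width=500, height=450)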
The GeoPandas support allows plotting GeoDataFrames containing 'Point', 'Polygon', 'LineString' and 'LinearRing' geometries, but not ones containing a mixture of different geometry types. Calling .hvplot will automatically figure out the geometry type to plot, but it is also possible to call .hvplot.points, .hvplot.polygons, and .hvplot.paths explicitly.
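The rule described above (one geometry type per frame, mapped to one plot method) can be sketched as a small lookup. This is a hypothetical helper written for illustration under the assumption that geometry types are given as plain strings; it is not how hvPlot implements the dispatch internally.

```python
def pick_hvplot_method(geom_types):
    """Choose a plot method name for a collection of geometry type names.

    Hypothetical helper mirroring the rule described above; it is not
    hvPlot's actual implementation.
    """
    method_for_type = {
        "Point": "points",
        "Polygon": "polygons",
        "LineString": "paths",
        "LinearRing": "paths",
    }
    unique_types = set(geom_types)
    unsupported = unique_types - set(method_for_type)
    if unsupported:
        raise ValueError(f"Unsupported geometry types: {sorted(unsupported)}")
    if len(unique_types) != 1:
        raise ValueError(f"Expected one geometry type, got: {sorted(unique_types)}")
    return method_for_type[unique_types.pop()]
```

With a real GeoDataFrame gdf, the type names could be taken from gdf.geom_type, and the returned name looked up on the accessor, for example getattr(gdf.hvplot, pick_hvplot_method(gdf.geom_type))().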
When plotting polygons, hvPlot will automatically color them by a dimension, but it is also possible to declare a specific column with the c keyword:
world = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres'))
# Left: automatic coloring; right: colored by the 'continent' column
world.hvplot(width=550) + world.hvplot(c='continent', width=500)
Declaring an output projection ¶
The crs= argument specifies the input projection, i.e., it declares how to interpret the incoming data values. You can independently choose any output projection, i.e., how you want to map the data points onto the screen for display, using the projection= argument. After loading the same temperature dataset explored in the Gridded Data section, the data can be displayed on an Orthographic projection:
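air_ds = xr.tutorial.load_dataset('air_temperature')
# Input data is in PlateCarree (crs, defined earlier); display it on a globe
# centered at longitude -90, latitude 30
air_ds.hvplot.quadmesh(
    'lon', 'lat', 'air', crs=crs, projection=ccrs.Orthographic(-90, 30),
    global_extent=True, width=600, height=540, cmap='viridis'
) * gv.feature.coastline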
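To make the distinction between input and output projection concrete, the sketch below applies the textbook spherical orthographic formulas to longitude/latitude pairs, using the same center (-90, 30) as the plot above. It is an illustration on a unit sphere under that simplifying assumption; the function name orthographic is hypothetical, and cartopy itself relies on the PROJ library and works in metres.

```python
import math


def orthographic(lon_deg, lat_deg, central_lon=-90.0, central_lat=30.0):
    """Project lon/lat (degrees) onto a unit-sphere orthographic plane.

    Returns (x, y), or None when the point is on the far side of the globe.
    Illustrative sketch only, not cartopy's implementation.
    """
    lam = math.radians(lon_deg - central_lon)
    phi = math.radians(lat_deg)
    phi0 = math.radians(central_lat)
    # Cosine of the angular distance from the projection center
    cos_c = (math.sin(phi0) * math.sin(phi)
             + math.cos(phi0) * math.cos(phi) * math.cos(lam))
    if cos_c < 0:
        return None  # hidden behind the horizon
    x = math.cos(phi) * math.sin(lam)
    y = (math.cos(phi0) * math.sin(phi)
         - math.sin(phi0) * math.cos(phi) * math.cos(lam))
    return x, y


print(orthographic(-90, 30))  # the projection center
print(orthographic(90, -30))  # the antipode, which is not visible
```

Points on the hidden hemisphere have no screen position at all, which is one reason the output projection has to be chosen separately from the coordinate system of the data.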
Note that when displaying raster data in a projection other than the one in which the data is stored, it is more accurate to render it as a quadmesh rather than an image. As you can see above, a QuadMesh will project each original bin or pixel into the correct non-rectangular shape determined by the projection, accurately showing the geographic extent covered by each sample. An Image, on the other hand, will always be rectangularly aligned in the 2D plane, which requires warping and resampling the data in a way that allows efficient display but loses accuracy at the pixel level. Unfortunately, rendering a large QuadMesh using Bokeh can be very slow, but there are two useful alternatives for datasets too large to be practical as native QuadMeshes.
The first is using the datashade or rasterize options to regrid the data before rendering it, i.e., rendering the data on the backend and then sending a more efficient image-based representation to the browser:
rasm = xr.tutorial.load_dataset('rasm')
# rasterize=True regrids the mesh on the backend before it is sent to the browser
rasm.hvplot.quadmesh(
    'xc', 'yc', crs=ccrs.PlateCarree(), projection=ccrs.PlateCarree(),
    ylim=(0, 90), width=800, height=400, cmap='viridis', rasterize=True
) * gv.feature.coastline
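Conceptually, regridding aggregates the original samples onto a fixed-size grid, so that only width x height values need to be sent to the browser no matter how large the input is. The following pure-Python sketch shows mean aggregation of scattered samples into such a grid. The function name rasterize_mean is hypothetical, and the rasterize option delegates to Datashader, whose algorithms differ from this simplified version.

```python
def rasterize_mean(xs, ys, values, x_range, y_range, width, height):
    """Average scattered samples into a height x width grid of cells.

    Cells that receive no samples are left as None. Hypothetical helper
    for illustration; not Datashader's algorithm.
    """
    (x0, x1), (y0, y1) = x_range, y_range
    sums = [[0.0] * width for _ in range(height)]
    counts = [[0] * width for _ in range(height)]
    for x, y, v in zip(xs, ys, values):
        if not (x0 <= x <= x1 and y0 <= y <= y1):
            continue  # outside the requested window
        # Map the coordinate to a cell index, clamping the upper edge
        col = min(int((x - x0) / (x1 - x0) * width), width - 1)
        row = min(int((y - y0) / (y1 - y0) * height), height - 1)
        sums[row][col] += v
        counts[row][col] += 1
    return [
        [s / c if c else None for s, c in zip(sum_row, count_row)]
        for sum_row, count_row in zip(sums, counts)
    ]


grid = rasterize_mean(
    xs=[0.1, 0.2, 0.9], ys=[0.1, 0.3, 0.9], values=[1.0, 3.0, 5.0],
    x_range=(0, 1), y_range=(0, 1), width=2, height=2,
)
print(grid)
```

The output size depends only on width and height, which is what makes the image-based representation cheap to transfer and display.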
Another option that's still relatively slow but avoids sending large data into your browser is to plot the data using contour and contourf visualizations, generating a line or filled contour with a discrete number of levels:
rasm.hvplot.contourf(
    'xc', 'yc', crs=ccrs.PlateCarree(), projection=ccrs.PlateCarree(),
    ylim=(0, 90), width=800, height=400, cmap='viridis', levels=10,
) * gv.feature.coastline
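The idea of "a discrete number of levels" can be illustrated by quantizing values into evenly spaced bands. The sketch below assumes that levels means the number of bands between the data minimum and maximum; that interpretation and the helper name contour_bands are assumptions made for illustration. It shows only the quantization step, not how the contour geometry is actually traced.

```python
from bisect import bisect_right


def contour_bands(values, levels=10):
    """Assign each value to one of `levels` evenly spaced bands.

    Returns (boundaries, band_indices). Hypothetical helper: it shows only
    the quantization step, not how contour geometry is traced.
    """
    lo, hi = min(values), max(values)
    step = (hi - lo) / levels
    # Interior boundaries between consecutive bands
    boundaries = [lo + step * i for i in range(1, levels)]
    # Band index = number of boundaries at or below the value
    bands = [bisect_right(boundaries, v) for v in values]
    return boundaries, bands


boundaries, bands = contour_bands([0, 5, 9.99, 10], levels=10)
print(boundaries)
print(bands)
```

Because every value collapses to one of a handful of band indices, the browser only needs the outlines of each band rather than the full grid of values.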
As you can see, hvPlot makes it simple to work with geographic data visually. For more complex plot types and additional details, see the GeoViews documentation.